REMARKS 



The claims have been amended by changing claims 1-14, canceling no claims, and 
adding a new claim 15. Claims 1-15 are in the application. 

Reconsideration of this application is respectfully requested. 

Informalities in the Abstract 

Because the most recent Office Action did not comment on the Abstract, applicant
assumes that the Examiner has accepted applicants' proposal.

Claim 11 was objected to under 37 C.F.R. 1.75(a) as being unclear.
Claim 11 has been amended to change its dependency to claim 10, as recommended by 
the examiner. 

Claim Rejections - 35 U.S.C. § 102(b):

Claims 1, 4, 5, 9, and 14 were rejected under 35 U.S.C. § 102(b) as being clearly 
anticipated by Basu [US Patent 6,594,629], for the same reasons as in the previous Office
Action. 

The last element of each of claims 1 and 14 has been amended in a similar manner, to read
approximately as amended in claim 1: "synchronously generating a sequence of a set of visemes
wherein each set of visemes in the sequence is derived from a corresponding one of the time
domain frame classification vectors." In addition, claims 1, 4, 5, 9, and 14 have been amended
to specify that the "speech signal" is an "audio speech signal". Applicant believes these changes
overcome the Examiner's argument in both of the Office Actions issued for this application
(mailed June 28, 2005 and Nov. 14, 2005).

In response to item b of the Examiner's reasons why the amendment of Jan. 17, 2006 did
not place the application in a better condition for allowance, listed under Item 11 of the
Advisory Action dated Jan. 24, 2006, Applicant asserts that a logical analysis of the wording of
claims 1, 9, and 14 as presented in this amendment leads to the unavoidable conclusion that
each set of visemes is derived from one frame of digitized analog speech information. In the
second element of the claims (filtering), "each of the time domain frame classification vectors is
derived from one of the successive frames of digitized analog speech information", and in the
third element of the claims (synchronously generating), "each set of visemes in the sequence is
derived from a corresponding one of the time domain frame classification vectors". Ergo, each
set of visemes in the sequence is derived from one of the successive frames of digitized analog
speech information.

For these reasons, applicant believes that claims 1 and 14 are patentable over Basu
and any combination of Basu and the art cited in this application. 

Applicant further believes that claims 4, 5, and 9 are patentable because they are 
dependent upon claim 1, which applicant believes is patentable. 



Claim Rejections - 35 U.S.C. § 102(e):

Claims 1, 2, 9, 10, and 12-14 were rejected under 35 U.S.C. § 102(e) as being 
anticipated by Sutton [US Patent 6,539,354] using the same rationale as in the Office action 
mailed June 28, 2005, and for supplemental reasons. 

Applicant respectfully traverses the Examiner's rejection of claims 1, 2, 9, 10, and 12-14
as being clearly anticipated by Sutton. Applicant believes that the Examiner has
mischaracterized Sutton. Sutton, at col. 19, lines 1-13, characterizes the phoneme generation
process as follows:

Referring to FIG. 8, a speech input stream 2B, or speech wave, is
received into the system in 10 ms frames at a sampling rate of typically 
between 8 kHz to 45 kHz (depending on the system capability and the 
desired speech quality). A feature representation is computed for each 
frame and assembled into a content (feature) window 6. The feature 
window 6 contains 160 ms of speech information or, in other words, data 
from sixteen 10 ms frames. The feature window 6 is transmitted to a 
phonetic (phoneme) estimator 10B. The phoneme estimator 10B includes a 
phoneme neural network 16B which receives the feature window 6 as an 
input and produces context-dependent phoneme (phone) estimates 12 as
an output. The phoneme estimates 12 are then sent to a viseme estimator 
30B. 

The viseme estimator 30B includes a viseme neural network 34B 
which takes the phoneme estimates 12 and produces viseme data 32 for 
the frames. The viseme data includes weighting information. 



Applicant analogized Sutton's window to a frame because in applicants' invention, the
time domain classification vector, which is based on one frame of audio information, is that from
which one set of visemes is generated, and in Sutton it is the content window from which the
visemes are generated. The use by applicant of "frame" this way may have been a non-ideal
choice. Applicant will hereafter use "frame" the way it is used in applicants' specification, in the
manner that is well known in the art, in which it is the smallest set of digitized audio samples
analyzed as a group by a digitized speech processor, and in some embodiments is a set of
digital samples representing 10 msec of audio speech (applicants' specification, page 3, lines
26-33). The point to be made is that Sutton's visemes are generated at a "content window"
rate (every 160 msec) or perhaps at an irregular phoneme rate, but clearly not at a rate of one
viseme or set of visemes for every frame as described by Sutton. (This last statement has been
revised in response to Examiner's reason 11c in the Advisory Action to clarify the meaning.)

Applicant's claim 1 makes it clear that a set of visemes is generated every frame. (This
last statement has been revised in response to Examiner's reason 11d in the Advisory Action to
clarify the meaning.)

For these reasons, applicant believes that claims 1 and 14 are patentable over Sutton 
and any combination of Sutton and the art cited in this application. 

Applicant believes that claims 2, 9, and 10 are patentable because they are dependent 
upon claim 1, which applicant believes is patentable. 

Applicant believes that claims 12 and 13 are patentable for the same reasons as claim 1.

Claim Rejections - 35 U.S.C. § 103: 

Claims 6-8 were rejected under 35 U.S.C. § 103 as being unpatentable over Basu (US 
Patent 6,594,629) in view of David J. Thomson, "An Overview of Multiple-Window and
Quadratic-Inverse Spectrum Estimation Methods," IEEE 1994, pp. VI 185-VI 194. 

Applicant believes that claims 6-8 are patentable because the combination of Thomson
with Basu fails, in that Basu fails for the reasons described above with reference to the
rejection of claims 1 and 14 over Basu.

Notwithstanding these reasons, applicant believes that claims 6-8 are patentable on their
own merits, and respectfully traverses Examiner's rejection thereof, for the reason that
Thomson does not provide a motivation for using N MTDPSSB functions in applicants' claimed
invention. In fact, the advantage described in Thomson of using MTDPSSB functions, which is
the advantage of achieving the best possible leakage properties for a dynamic range, was
purposefully sacrificed in order to perform the computation of the MTDPSSB with the low
latency needed to achieve synchronization.



Accordingly, this application is believed to be in proper form for allowance and an early notice of 
allowance is respectfully requested. 

Please charge any fees associated herewith, including extension of time fees, to 

502117, 

Respectfully submitted, 